Early MFCC and HPCP Fusion for Robust Cover Song Identification
Abstract
While most schemes for automatic cover song identification have focused on note-based features such as HPCP and chord profiles, a few recent papers have shown, surprisingly, that local self-similarities of MFCC-based features also have classification power for this task. Since MFCC and HPCP capture complementary information, we design an unsupervised algorithm that combines normalized, beat-synchronous blocks of these features using cross-similarity fusion before attempting to locally align a pair of songs. As an added bonus, our scheme naturally incorporates structural information in each song to fill in alignment gaps where both feature sets fail. We show a striking jump in performance over MFCC and HPCP alone, achieving a state-of-the-art mean reciprocal rank (MRR) of 0.87 on the Covers80 dataset. We also introduce a new medium-sized, hand-designed benchmark dataset called “Covers 1000,” which consists of 395 cliques of cover songs for a total of 1000 songs, and we show that our algorithm achieves an MRR of 0.9 on this dataset for the first correctly identified song in a clique. We provide the precomputed HPCP and MFCC features, as well as beat intervals, for all songs in the Covers 1000 dataset for use in further research.
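The mean reciprocal rank quoted in the abstract can be illustrated with a minimal sketch. It assumes a precomputed song-by-song similarity matrix and one clique label per song; the function name `mean_reciprocal_rank`, the toy matrix, and the labels are hypothetical and are not taken from the paper's evaluation code. For each query song, the remaining songs are ranked by decreasing similarity, and the reciprocal of the rank of the first song from the same clique is averaged over all queries.

```python
import numpy as np


def mean_reciprocal_rank(similarity, clique_ids):
    """MRR of the first correctly identified cover for each query song.

    similarity: (n, n) array, larger values mean "more similar" (assumed).
    clique_ids: length-n labels; songs sharing a label are covers.
    """
    similarity = np.asarray(similarity, dtype=float)
    clique_ids = np.asarray(clique_ids)
    n = len(clique_ids)
    reciprocal_ranks = []
    for query in range(n):
        # Every song except the query itself is a retrieval candidate.
        candidates = np.array([j for j in range(n) if j != query])
        # Rank candidates from most to least similar to the query.
        ranking = np.argsort(-similarity[query, candidates], kind="stable")
        ordered = candidates[ranking]
        is_cover = clique_ids[ordered] == clique_ids[query]
        if not is_cover.any():
            continue  # query has no other version in the collection
        first_hit_rank = int(np.argmax(is_cover)) + 1  # ranks start at 1
        reciprocal_ranks.append(1.0 / first_hit_rank)
    return float(np.mean(reciprocal_ranks))


# Toy data (made up): four songs in two cliques of two songs each.
toy_similarity = [
    [0.0, 0.9, 0.2, 0.1],
    [0.9, 0.0, 0.3, 0.1],
    [0.2, 0.3, 0.0, 0.1],
    [0.1, 0.2, 0.8, 0.0],
]
toy_cliques = [0, 0, 1, 1]
toy_mrr = mean_reciprocal_rank(toy_similarity, toy_cliques)
print(f"MRR on toy data: {toy_mrr:.4f}")
```

In the toy data, three queries retrieve their cover at rank 1 and one retrieves it at rank 3, so the score is the average of 1, 1, 1/3, and 1. An MRR close to 1 therefore means that the first true cover is almost always the top-ranked result.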
Similar resources
Cover Song Identification Based on Similarity Fusion
We describe a similarity fusion based cover song identification scheme. The Harmonic Pitch Class Profile (HPCP) is chosen as the musical descriptor. First, the similarities between the HPCP descriptors of two songs are obtained based on the Qmax function and the Dmax function, respectively. Then these two similarities are fused via the Similarity Network Fusion (SNF) technique, which was originally proposed for ...
Two-layer similarity fusion model for cover song identification
Various musical descriptors have been developed for Cover Song Identification (CSI). However, different descriptors are based on various assumptions, designed for representing distinct characteristics of music, and often differ in scale and noise level. Therefore, a single similarity function combined with a specific descriptor is generally not able to describe the similarity between songs comp...
Improving the performance of MFCC for Persian robust speech recognition
Mel-frequency cepstral coefficients (MFCCs) are the most widely used features in speech recognition, but they are very sensitive to noise. In this paper, to achieve satisfactory performance in Automatic Speech Recognition (ASR) applications, we introduce a new noise-robust set of MFCC vectors estimated through the following steps. First, spectral mean normalization is a pre-processing step which applies to t...
Cover Song Identification with Timbral Shape Sequences
We introduce a novel low-level feature for identifying cover songs which quantifies the relative changes in the smoothed frequency spectrum of a song. Our key insight is that a sliding window representation of a chunk of audio can be viewed as a time-ordered point cloud in high dimensions. For corresponding chunks of audio between different versions of the same song, these point clouds are appr...
The MTG Submission to MIREX 2008 Audio Cover Song Identification Task
This is an extended abstract that gives an overview of our cover song identification system as submitted to the MIREX 2008 Audio Cover Song Identification task. The system is developed from our 2007 MIREX submission but includes some important modifications and parameter tuning. For time reasons, the present document is just a very early draft version. Further versions will be available soon.